Abstract: We consider distributed convex optimization problems
that involve a separable objective function and nontrivial
functional constraints, such as Linear Matrix Inequalities (LMIs).
We propose a decentralized and computationally inexpensive
algorithm based on the concept of approximate projections.
Our algorithm is a consensus-based method in that, at every
iteration, each agent performs a consensus update of its decision
variables followed by an optimization step on its local objective
function and local constraints. Unlike other methods, the last
step of our method is not a Euclidean projection onto the
feasible set, but instead a subgradient step in the direction that
minimizes the local constraint violation. We propose two
different averaging schemes to mitigate the disagreements among
the agents' local estimates over a time-varying sequence of
strongly connected digraphs. We show that the algorithms
converge almost surely, i.e., every agent agrees on the same
optimal solution, under the assumption that the objective
functions and constraint functions are possibly nondifferentiable
and their subgradients are bounded. We provide simulation
results on a decentralized optimal gossip averaging problem,
which involves SDP constraints, to complement our theoretical
results.

I. Introduction 

Decentralized optimization has been extensively
studied in recent years due to a variety of applications
in machine learning, signal processing, and control for robotic
networks, sensor networks, power networks, and wireless
communication networks ffl-H. A number of problems arising
in these areas can be cast as distributed convex optimization 
problems over multiagent networks, where individual agents 
cooperatively try to minimize a common cost function over 
a common constraint set in the absence of full knowledge 
about the global problem structure. The main feature of 
carrying these optimizations over networks is that the agents 
can only communicate with their neighboring agents. This 
communication structure can be cast as a graph, often directed 
and/or time-varying. 

The literature on distributed optimization methods is vast 
and involves first-order methods in the primal domain, the dual 
domain, augmented Lagrangian methods, or Newton methods, 
to name a few. Here we discuss methods that are most closely 
related to the method under consideration. Among those,
some of the most well-studied techniques are the so-called
consensus-based optimization algorithms a-ca (see also
the literature for the consensus problem itself Q, Ell),
where the goal is to repeatedly average the estimates of all
agents in a decentralized fashion in order to obtain a
network-wide consensus. Between the averaging steps, each agent
usually performs a single local optimization step. Overall, the 
agents use their local information to cooperatively steer the 
consensus point toward the optimal set of the global problem. 

Decentralized algorithms that fall in this class of methods 
can be distinguished based on which averaging scheme or 
optimization method is used, and in which space (primal or 
dual) the iterates are maintained. These algorithms often
require expensive optimization steps or exact projections onto a
complicated constraint set at every iteration. Such intensive
computations take time and may shorten the lifespan of
energy-constrained systems, such as wireless sensor networks
or robotic networks.

In this work, we propose a new decentralized algorithm based
on approximate projections and prove its convergence. Our
work in this paper is an extension of the author's previous work
[28]. Specifically, we use the same local information exchange
model and gradient descent algorithm as in [28], but a different
projection method motivated by the work in ll2^ . In contrast
to [28], our contribution can be summarized as follows: (1)
Instead of using the Euclidean projection, we approximate it
by measuring the constraint violation and taking a subgradient
step minimizing this violation; (2) We show convergence under
milder assumptions. Specifically, we remove the smoothness
assumption on the objective functions; (3) We propose two
different averaging schemes to mitigate the disagreements
among the agents' local estimates, one of which removes the
double stochasticity assumption on the weight matrices.

Considering that projections have a closed form solution 
only in a few special cases of constraints, our new algorithm 
is more general and can be applied to a wider class of 
problems including Semidefinite Programming (SDP), where 
the constraints are represented by Linear Matrix Inequalities 
(LMIs). It is well known that even finding a feasible point that 
satisfies a handful of LMIs is a difficult problem on its own. 
The work in this paper is also related to the centralized random
projection algorithms for convex constrained optimization
and convex feasibility problems ED. Other related works are
ES-Ell, where optimization problems with uncertain
constraints have been considered by finding probabilistic feasible
solutions through random sampling of constraints.

The paper is organized as follows. In Section II we
formulate the optimization problem under consideration and
discuss specific problems of interest. In Section III we provide
our decentralized algorithm based on random approximate
projections, discuss the communication scheme employed by
the agents, and state the assumptions and the main results of this paper.
In Section IV we first review some necessary results and
lemmas from existing literature, provide proofs of required
lemmas, and then present the proofs of the main results
discussed in Section III. In Section V we present simulation
results for a decentralized SDP problem, namely optimal
decentralized gossip averaging. We conclude the paper with
some comments in Section VI.

Notation: All vectors are viewed as column vectors. We
write $x^\top$ to denote the transpose of a vector $x$. The scalar
product of two vectors $x$ and $y$ is $\langle x, y \rangle$. For vectors associated
with agent $i$ at time $k$, we use subscripts $i,k$ such as, for
example, $p_{i,k}$, $x_{i,k}$, etc. Unless otherwise stated, $\|\cdot\|$ represents
the standard Euclidean norm. For a set $S$, we use $|S|$ to denote
its cardinality. For a matrix $A$, we use $[A]_{ij}$ to denote
the entry of the $i$-th row and $j$-th column and $\|A\|_F$ to denote
the Frobenius norm $\|A\|_F = \big(\sum_{i,j} ([A]_{ij})^2\big)^{1/2}$. We use
$\mathrm{Tr}A$ to denote the trace of $A$, i.e., $\mathrm{Tr}A = \sum_{i} [A]_{ii}$. We
denote by $\mathbb{S}_m$ and $\mathbb{S}_m^+$ the space of $m \times m$ real symmetric
and real symmetric positive semidefinite matrices, respectively.
The matrix inequality $A \preceq 0$ means $-A$ is positive semidefinite.
We use $\mathbf{1}$ and $\mathbf{0}$ to denote vectors of all ones and zeros.
The identity matrix is denoted by $I$. We use $\Pr\{Z\}$ and $\mathsf{E}[Z]$ to
denote the probability and the expectation of a random variable
$Z$. We write $\mathrm{dist}(x, \mathcal{X})$ for the distance of a vector $x$ from a
closed convex set $\mathcal{X}$, i.e., $\mathrm{dist}(x, \mathcal{X}) = \min_{v \in \mathcal{X}} \|v - x\|$. We
use $\Pi_{\mathcal{X}}[x]$ for the Euclidean projection of a vector $x$ on the set
$\mathcal{X}$, i.e., $\Pi_{\mathcal{X}}[x] = \arg\min_{v \in \mathcal{X}} \|v - x\|^2$. We often abbreviate
almost surely and independent identically distributed as a.s.
and i.i.d., respectively.

II. Problem Definition 

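Consider a multiagent network system whose communication
at time $k$ is governed by a digraph $\mathcal{G}_k = (\mathcal{V}, \mathcal{E}_k)$, where
$\mathcal{V} = \{1, \ldots, N\}$ and $\mathcal{E}_k \subseteq \mathcal{V} \times \mathcal{V}$. If there exists a directed
link from agent $j$ to $i$, which we denote by $(j,i)$, agent $j$
may send its information to agent $i$. Thus, each agent $i \in \mathcal{V}$
can directly receive information only from the agents in its
in-neighborhood

$$\mathcal{N}^{\mathrm{in}}_{i,k} = \{ j \in \mathcal{V} \mid (j,i) \in \mathcal{E}_k \} \cup \{i\}, \tag{1}$$

and send information only to the agents in its
out-neighborhood

$$\mathcal{N}^{\mathrm{out}}_{i,k} = \{ j \in \mathcal{V} \mid (i,j) \in \mathcal{E}_k \} \cup \{i\}, \tag{2}$$

where in both $\mathcal{N}^{\mathrm{in}}_{i,k}$ and $\mathcal{N}^{\mathrm{out}}_{i,k}$ we assume there exists a
self-loop $(i,i)$ for all $i \in \mathcal{V}$. Also, we use $d_i(k)$ to denote the
in-degree of node $i$ at iteration $k$, i.e.,

$$d_i(k) = |\mathcal{N}^{\mathrm{in}}_{i,k}|. \tag{3}$$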
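The in-neighborhood and in-degree definitions in (1) and (3) can be illustrated with a small sketch. The function names and the edge-list representation below are hypothetical choices made for illustration; a pair `(j, i)` stands for a directed link from agent `j` to agent `i`, and the self-loop is always included.

```python
def in_neighborhoods(num_agents, edges):
    """Map each agent i (numbered 1..N) to its in-neighborhood, self-loop included.

    edges is a list of pairs (j, i), meaning agent j may send to agent i.
    """
    nbrs = {i: {i} for i in range(1, num_agents + 1)}  # self-loop (i, i)
    for j, i in edges:
        nbrs[i].add(j)
    return nbrs


def in_degrees(num_agents, edges):
    """In-degree d_i(k): the size of the in-neighborhood of agent i."""
    return {i: len(s) for i, s in in_neighborhoods(num_agents, edges).items()}


# Example digraph on three agents at some fixed time k (illustrative data).
edges_k = [(1, 2), (2, 3), (3, 1), (1, 3)]
print(in_neighborhoods(3, edges_k))
print(in_degrees(3, edges_k))
```

Because the self-loop is counted, every in-degree is at least one, which is what makes a division by the in-degree well defined when weights are built from it.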

A. Problem Statement 

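Our goal is to let the network of agents cooperatively solve
the following minimization problem:

$$\min_{x} \; f(x) = \sum_{i \in \mathcal{V}} f_i(x) \tag{4}$$
$$\text{s.t.} \quad x \in \mathcal{X}, \quad \mathcal{X} \triangleq \mathcal{X}_0 \cap \Big( \bigcap_{i \in \mathcal{V}} \mathcal{X}_i \Big),$$

where only agent $i$ knows the function $f_i : \mathbb{R}^n \to \mathbb{R}$ and the
constraint set $\mathcal{X}_i \subseteq \mathbb{R}^n$. The set $\mathcal{X}_0 \subseteq \mathbb{R}^n$ is common to all
agents and assumed to have some simple structure in the sense
that the projection onto $\mathcal{X}_0$ can be computed easily (e.g., a box,
ball, probability simplex, or even $\mathbb{R}^n$). Note that the common
constraint set, i.e., $\mathcal{X}_0 = \mathcal{X}_i$ for all $i \in \mathcal{V}$, is a special case
of this problem definition. We assume that the set of optimal
solutions $\mathcal{X}^* = \arg\min_{x \in \mathcal{X}} f(x)$ is nonempty.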

We assume each agent $i$'s local constraint set $\mathcal{X}_i$ consists
of one or more algebraic inequalities, which we denote by

$$\mathcal{X}_i = \{ x \in \mathbb{R}^n \mid g(x,\omega) \le 0, \ \forall \omega \in \Omega_i \},$$

where $\Omega_i$ is a finite collection of indices. From this definition,
the feasible set $\mathcal{X}$ can be precisely represented as

$$\mathcal{X} = \{ x \in \mathcal{X}_0 \mid g(x,\omega) \le 0, \ \forall \omega \in \Omega_i, \ i \in \mathcal{V} \}.$$

Note that some of the inequalities may overlap across different
agents, i.e., $\Omega_i \cap \Omega_j$ for $i \ne j$ can be either empty or nonempty.

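We also consider an equivalent epigraph form of problem
(4) by introducing a new set of variables $t = [t_1 \ \ldots \ t_N]^\top \in \mathbb{R}^N$.
Consider that the local constraint set $\mathcal{X}_i$ for $i \in \mathcal{V}$ now
includes the additional inequality constraint $f_i(y) \le t_i$. Then,
problem (4) is equivalent to:

$$\min_{x} \; a^\top x \tag{5}$$
$$\text{s.t.} \quad y \in \mathcal{X}, \quad \mathcal{X} \triangleq \mathcal{X}_0 \cap \Big( \bigcap_{i \in \mathcal{V}} \mathcal{X}_i \Big),$$

where $x = [y^\top \ t^\top]^\top$ and $a = [\mathbf{0}^\top \ \mathbf{1}^\top]^\top$.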

B. Problems of Interest 

Problems of particular interest are those involving many
nontrivial constraints onto which exact projections are impossible
or computationally intractable. Here we provide two such
examples:

1) Robust Linear Inequalities:

$$\mathcal{X}_i = \big\{ x \in \mathbb{R}^n \mid A(\omega)x \le b(\omega), \ \forall \omega \text{ such that } \|A(\omega) - A_0\|_{\mathrm{op}} \le r_1 \text{ and } \|b(\omega) - b_0\|_{\mathrm{op}} \le r_2 \big\},$$

where $A_0 \in \mathbb{R}^{m \times n}$, $b_0 \in \mathbb{R}^m$ are nominal data, $r_1, r_2 > 0$
are the levels of uncertainty, and $\|\cdot\|_{\mathrm{op}}$ denotes an
operator norm. Here we cannot handle each row of
$A(\omega)x \le b(\omega)$ separately, because the matrix
operator norm $\|\cdot\|_{\mathrm{op}}$ couples the rows.

2) Linear Matrix Inequalities:

$$\mathcal{X}_i = \Big\{ x \in \mathbb{R}^n \ \Big|\ A_0(\omega) + \sum_{j=1}^{n} x_j A_j(\omega) \preceq 0, \ \forall \omega \in \Omega_i \Big\}, \tag{6}$$

where $A_j(\omega) \in \mathbb{S}_m$ for $j = 0, 1, \ldots, n$, $\omega \in \Omega_i$
are given matrices. The inequalities in (6) are referred
to as linear matrix inequalities (LMIs). A semidefinite
programming (SDP) problem has one or more LMI
constraints. Finding a feasible point of the set (6) is
often a difficult problem on its own.

Note that the inequalities in (6) can represent a wide variety
of convex constraints (see llTSl for more details). For example,
quadratic inequalities, inequalities involving matrix norms, and
various inequality constraints arising in robust control such as
Lyapunov and quadratic matrix inequalities can all be cast as
LMIs in (6). When all matrices $A_j(\omega)$ in (6) are diagonal, the
LMIs reduce to regular linear inequalities.

III. Algorithm, Assumptions, and Main Results 

Our goal is to design a decentralized protocol by which each
agent $i \in \mathcal{V}$ maintains a sequence of local copies $\{x_{i,k}\}_{k \ge 0}$
converging to the same point in $\mathcal{X}^*$ as $k$ goes to infinity. Since
we assume that the local constraint sets $\mathcal{X}_i$ are nontrivial,
we do not find an exact projection onto $\mathcal{X}_i$ at each step of
the algorithm. Instead, at iteration $k$, each agent $i$ randomly
generates an index $\omega_{i,k} \in \Omega_i$ and makes an approximate
projection on the selected inequality $g(\cdot, \omega_{i,k}) \le 0$.

A. Decentralized Algorithm with Approximate Projections 

We formally present our decentralized algorithm, named the
Decentralized Approximate Projection (DAP), in Algorithm 1.
Each agent $i$ maintains a sequence $\{x_{i,k}\}_{k \ge 0}$. The element
$x_{i,k}$ of the sequence can be seen as agent $i$'s estimate of
the decision variable $x$ at time $k$. Let $g^+(x,\omega)$ denote the
function that measures the violation of the constraint $g(\cdot,\omega)$
at $x$, i.e., $g^+(x,\omega) = \max\{g(x,\omega), 0\}$.


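Algorithm 1 Decentralized Approximate Projection (DAP)
Let $x_{i,0} \in \mathcal{X}_0$ for $i \in \mathcal{V}$ and the nonnegative parameter
$\{\alpha_k\}_{k \ge 0}$ be given.

Set $k := 1$

while Maximum iteration number is not reached do
Each agent $i$ updates $x_{i,k}$ according to

$$p_{i,k} = \sum_{j \in \mathcal{V}} [W_k]_{ij}\, x_{j,k-1} \tag{7a}$$

$$v_{i,k} = \Pi_{\mathcal{X}_0}\big[ p_{i,k} - \alpha_k s_{i,k} \big] \tag{7b}$$

$$x_{i,k} = \Pi_{\mathcal{X}_0}\Big[ v_{i,k} - \frac{g^+(v_{i,k}, \omega_{i,k})}{\|d_{i,k}\|^2}\, d_{i,k} \Big] \tag{7c}$$

Set $k := k + 1$
end while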


At $k = 0$, the estimates $x_{i,0}$ are locally initialized such
that $x_{i,0} \in \mathcal{X}_0$. At time step $k$, all agents $j \in \mathcal{V}$ broadcast
their previous estimates $x_{j,k-1}$ to all of the nodes in their
out-neighborhood, i.e., to all agents $i$ such that $(j,i) \in \mathcal{E}_k$.
Then, each agent $i \in \mathcal{V}$ updates $x_{i,k}$ using (7a)-(7c), where
$W_k$ is a nonnegative $N \times N$ weight matrix; $\{\alpha_k\}$ is a positive
sequence of nonincreasing stepsizes; $s_{i,k}$ is a subgradient of
the function $f_i$ at $p_{i,k}$; $\omega_{i,k}$ is a random variable taking values
in the index set $\Omega_i$; and $d_{i,k}$ is a subgradient of $g^+(\cdot, \omega_{i,k})$
evaluated at $v_{i,k}$. The vector $d_{i,k}$ is chosen such that $d_{i,k} \in
\partial g^+(v_{i,k}, \omega_{i,k})$ if $g^+(v_{i,k}, \omega_{i,k}) > 0$, and $d_{i,k} = d$ for some
$d \ne 0$ if $g^+(v_{i,k}, \omega_{i,k}) = 0$.

More specifically, in (7a), each agent $i$ calculates a weighted
average of the received messages (including its own message
$x_{i,k-1}$) to obtain $p_{i,k}$. Here, $[W_k]_{ij} \ge 0$ is the
weight that agent $i$ allocates to the message $x_{j,k-1}$. This
communication step is decentralized since the weight matrix
$W_k$ respects the topology of the graph $\mathcal{G}_k$, i.e., $[W_k]_{ij} > 0$
only if $(j,i) \in \mathcal{E}_k$ and $[W_k]_{ij} = 0$ otherwise. In (7b),
each agent $i$ adjusts the average $p_{i,k}$ in the direction of the
negative subgradient of its local objective $f_i$ to obtain $v_{i,k}$.
The adjusted average is projected back to the simple set $\mathcal{X}_0$. In
(7c), agent $i$ observes a random realization of $\omega_{i,k} \in \Omega_i$ and
measures the feasibility violation of the selected component
constraint $g(\cdot, \omega_{i,k})$ at $v_{i,k}$. If $g^+(v_{i,k}, \omega_{i,k}) > 0$, it calculates
a subgradient $d_{i,k} \in \partial g^+(v_{i,k}, \omega_{i,k})$ and takes an additional
subgradient step with the stepsize $g^+(v_{i,k}, \omega_{i,k}) / \|d_{i,k}\|^2$ to minimize
this violation. If $g^+(v_{i,k}, \omega_{i,k}) = 0$, then the current point
$v_{i,k}$ already satisfies the selected inequality $g(\cdot, \omega_{i,k}) \le 0$. In
this case, there is no need to move the point further into the
selected set. Therefore, the approximate projection step (7c)
is just omitted.
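To make the three updates concrete, the following is a minimal sketch of one DAP iteration for a single agent. It rests on several assumptions that are not part of the text: the simple set is taken to be a box, every constraint is a scalar linear inequality $\langle a, x \rangle - b \le 0$ with $a \ne 0$, and the names `dap_step`, `project_box`, and `subgrad_f` are hypothetical.

```python
import random


def project_box(x, lo, hi):
    """Euclidean projection onto the box [lo, hi]^n (assumed simple set)."""
    return [min(max(xi, lo), hi) for xi in x]


def dap_step(x_prev, weights_i, subgrad_f, constraints, alpha, lo, hi, rng):
    """One update (7a)-(7c) for agent i.

    x_prev      : previous estimates of all agents (list of vectors)
    weights_i   : row i of the weight matrix W_k
    subgrad_f   : callable returning a subgradient of f_i at a point
    constraints : list of (a, b) encoding g(x, w) = <a, x> - b <= 0, a != 0
    """
    n = len(x_prev[0])
    # (7a) consensus: weighted average of the received estimates
    p = [sum(w * xj[t] for w, xj in zip(weights_i, x_prev)) for t in range(n)]
    # (7b) subgradient step on the local objective, projected onto the box
    s = subgrad_f(p)
    v = project_box([pt - alpha * st for pt, st in zip(p, s)], lo, hi)
    # (7c) approximate projection on one randomly selected inequality
    a, b = rng.choice(constraints)
    violation = max(sum(at * vt for at, vt in zip(a, v)) - b, 0.0)
    if violation == 0.0:
        return v  # selected inequality already holds; v is already in the box
    norm_sq = sum(at * at for at in a)  # d = a is a subgradient of g^+ when g > 0
    step = violation / norm_sq
    return project_box([vt - step * at for vt, at in zip(v, a)], lo, hi)


# Illustrative data: two agents, f_i(x) = ||x||^2 / 2, one constraint x1 + x2 <= 0.5.
x_new = dap_step([[2.0, 0.0], [0.0, 2.0]], [0.5, 0.5], lambda p: list(p),
                 [([1.0, 1.0], 0.5)], 0.5, -5.0, 5.0, random.Random(0))
print(x_new)
```

In this example the consensus point is `[1, 1]`, the objective step gives `[0.5, 0.5]`, and the final step lands exactly on the boundary of the violated half-space. For a single linear inequality the approximate projection coincides with the exact one; for general convex constraints this is not the case.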

Note that the description of the DAP algorithm is only
conceptual at this moment since we have not specified the
parameters $\{\alpha_k\}$ and $\{W_k\}$ yet. The stepsizes $\{\alpha_k\}$ should
be nonnegative, nonincreasing and such that

$$\sum_{k=1}^{\infty} \alpha_k = \infty \quad \text{and} \quad \sum_{k=1}^{\infty} \alpha_k^2 < \infty. \tag{8}$$

For the sequence of weight matrices $\{W_k\}$, we assume the
following.
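As a worked example of the stepsize conditions, consider the family $\alpha_k = 1/k^{q}$; the exponent $q$ is a symbol introduced here for illustration and does not appear in the text. Standard facts about $p$-series show which exponents are admissible:

```latex
% Stepsizes \alpha_k = k^{-q}: nonnegative and nonincreasing for every q > 0.
\[
  \sum_{k=1}^{\infty} \frac{1}{k^{q}} = \infty \iff q \le 1,
  \qquad
  \sum_{k=1}^{\infty} \frac{1}{k^{2q}} < \infty \iff q > \tfrac{1}{2}.
\]
% Both conditions hold exactly when 1/2 < q <= 1. For q = 1:
\[
  \sum_{k=1}^{\infty} \frac{1}{k} = \infty,
  \qquad
  \sum_{k=1}^{\infty} \frac{1}{k^{2}} = \frac{\pi^{2}}{6} < \infty .
\]
```

A constant stepsize $\alpha_k = \alpha > 0$ satisfies the first condition but not the second, so it is not covered by these conditions.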

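Assumption 1: For all $k \ge 1$,

(a) $[W_k]_{ij} \ge 0$ for all $i,j \in \mathcal{V}$ and $[W_k]_{ij} = 0$ only if
$j \notin \mathcal{N}^{\mathrm{in}}_{i,k}$.

(b) There exists a scalar $\nu \in (0,1)$ such that $[W_k]_{ij} \ge \nu$
only if $j \in \mathcal{N}^{\mathrm{in}}_{i,k}$.

(c) $\sum_{j \in \mathcal{V}} [W_k]_{ij} = 1$ for all $i \in \mathcal{V}$ and $\sum_{i \in \mathcal{V}} [W_k]_{ij} = 1$
for all $j \in \mathcal{V}$.

Condition (a) ensures that the weight matrices $W_k$ respect the
underlying topology $\mathcal{G}_k$ for every $k$ so that the communication
is indeed decentralized. The lower boundedness of the weights
in (b) is required to show consensus among all agents (see IMIl
for more details), but the agents need not know the value of $\nu$ in
running the algorithm. Conditions (a) and (c) imply double
stochasticity of the matrices $W_k$.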

Here, assuming double stochasticity of $W_k$ for all $k \ge 1$
might be too strong, as finding such a $W_k$ usually requires
a global view (unless the underlying graph is regular or
fully connected) and not all directed graphs admit a doubly
stochastic matrix lE?]. We can lift this limitation by defining
the weights as follows: For all $k \ge 1$,

$$[W_k]_{ij} = \begin{cases} \dfrac{1}{d_i(k)} & \text{if } j \in \mathcal{N}^{\mathrm{in}}_{i,k}, \\[2mm] 0 & \text{otherwise,} \end{cases} \tag{9}$$

where $d_i(k)$ is defined in (3). Note that this choice of weights
also respects the underlying topology $\mathcal{G}_k$ for every $k$. Moreover,
the matrices $W_k$ are row stochastic by construction, but
not necessarily column stochastic.
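The row-stochastic construction can be checked numerically. The sketch below assumes equal-neighbor weights, i.e., each agent splits unit weight evenly over its in-neighborhood (self-loop included) using its in-degree; the function name and the 0-based agent numbering are hypothetical choices made for illustration.

```python
def equal_neighbor_weights(num_agents, edges):
    """Build W with W[i][j] = 1 / d_i if j is an in-neighbor of i, else 0.

    Agents are numbered 0..N-1; edges are pairs (j, i) for a link from j to i.
    Each agent only needs its own in-degree, so no global view is required.
    """
    nbrs = {i: {i} for i in range(num_agents)}  # self-loops
    for j, i in edges:
        nbrs[i].add(j)
    W = [[0.0] * num_agents for _ in range(num_agents)]
    for i in range(num_agents):
        for j in nbrs[i]:
            W[i][j] = 1.0 / len(nbrs[i])
    return W


W = equal_neighbor_weights(3, [(0, 1), (1, 2), (2, 0), (0, 2)])
row_sums = [sum(row) for row in W]
col_sums = [sum(W[i][j] for i in range(3)) for j in range(3)]
print(row_sums)  # every row sums to one
print(col_sums)  # columns need not sum to one
```

In this example agent 0 sends to both other agents, so its column receives more than unit weight, which is exactly why such a matrix is in general not column stochastic.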

Recall that our problems of interest involve a large number
of constraints. Therefore, the random selection of a constraint
in (7c) serves as a computationally efficient alternative to finding
the most violated constraint, which typically has significantly
higher per-iteration complexity. Another situation that
necessitates the random selection approach is when the constraints
are not fully given in advance, but are rather revealed in a
sequential fashion (as in online optimization).

Note that the step (7c) guarantees that $x_{i,k} \in \mathcal{X}_0$ for
all $k \ge 0$ and $i \in \mathcal{V}$, but it does not necessarily
guarantee $x_{i,k} \in \mathcal{X}$. In Section IV we show that $x_{i,k}$ for all
$i \in \mathcal{V}$ asymptotically achieve feasibility nevertheless, i.e.,
$\lim_{k \to \infty} \|x_{i,k} - \Pi_{\mathcal{X}}[p_{i,k}]\| = 0$ for all $i \in \mathcal{V}$.

To further explain the step (7c), let us consider the two
particular cases mentioned in Section II-B.

1) Let $c^+$ denote the projection of a vector $c \in \mathbb{R}^m$ onto the
nonnegative orthant. We introduce a scalar function in
order to handle all the rows of the inequality $A(\omega)x \le b$
concurrently,

$$g^+(x,\omega) = \|(A(\omega)x - b)^+\|,$$

which is convex in $x$ for any given $\omega \in \Omega_i$ and $i \in \mathcal{V}$.
Then, it is straightforward to see that its subgradient can
be calculated as

$$\partial g^+(x,\omega) = \frac{A(\omega)^\top (A(\omega)x - b)^+}{\|(A(\omega)x - b)^+\|}$$

if $g^+(x,\omega) > 0$, and $\partial g^+(x,\omega) = d$ for some $d \ne 0$,
otherwise.

2) Let us define the projection $A^+$ of a real symmetric
matrix $A$ onto the cone of positive semidefinite matrices.
For any $A \in \mathbb{S}_m$, we can find an eigenvalue decomposition
$A = B \Lambda B^\top$, where $B$ is an orthogonal matrix and
$\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_m)$. Then, its projection is given by

$$A^+ = B \Lambda^+ B^\top,$$

where $\Lambda^+ = \mathrm{diag}(\lambda_1^+, \ldots, \lambda_m^+)$ with $\lambda_j^+ = \max\{0, \lambda_j\}$
[38]. Let us define

$$A(x,\omega) = A_0(\omega) + \sum_{j=1}^{n} x_j A_j(\omega).$$

Then, the amount of violation of the corresponding
LMI constraint $A(x,\omega) \preceq 0$ can be measured by the
following scalar function:

$$g^+(x,\omega) = \|A^+(x,\omega)\|_F. \tag{10}$$

By direct calculations, it is not difficult to see that its
subgradient is given by

$$\partial g^+(x,\omega) = \frac{1}{g^+(x,\omega)} \begin{pmatrix} \mathrm{Tr} A_1 A^+(x,\omega) \\ \vdots \\ \mathrm{Tr} A_n A^+(x,\omega) \end{pmatrix} \tag{11}$$

if $g^+(x,\omega) > 0$, and $\partial g^+(x,\omega) = d$ for some $d \ne 0$,
otherwise.

Remark: Note that the computational complexity of step (7c)
depends on the type of the function $g(\cdot, \omega_{i,k})$. If $g(\cdot, \omega_{i,k})$
is a general convex function, it takes $O(1)$ computations
for the evaluation of $g^+(\cdot, \omega_{i,k})$ and $O(n)$ computations for
the evaluation of the gradient $d_{i,k}$. If $g(\cdot, \omega_{i,k})$ is an LMI
constraint, it takes $O(m^3)$ computations in the worst case for
the eigenvalue decomposition and $O(m^2)$ for the computation
of the Frobenius norm (cf. Eq. (10)). We would also need
$O(m^2 n)$ computations for computing the traces (cf. Eq. (11)).
This eigenvalue decomposition is necessary for projection (or
approximate projection) onto the cone of positive semidefinite
matrices. ■

It is also worth mentioning that the algorithm (7a)-(7c)
includes the method that has been proposed in [28] as a special
case. In order to see this, let $\mathcal{X}_0 = \mathbb{R}^n$ and $g(x, \omega_{i,k}) =
\mathrm{dist}(x, \mathcal{X}_i^{\omega_{i,k}})$, where $\mathcal{X}_i^{\omega_{i,k}} = \{ x \in \mathbb{R}^n \mid g(x, \omega_{i,k}) \le 0 \}$.
Then, it is not difficult to see that

$$d_{i,k} = \frac{v_{i,k} - \Pi_{\mathcal{X}_i^{\omega_{i,k}}}[v_{i,k}]}{\big\| v_{i,k} - \Pi_{\mathcal{X}_i^{\omega_{i,k}}}[v_{i,k}] \big\|},$$

and since $\mathrm{dist}(v_{i,k}, \mathcal{X}_i^{\omega_{i,k}}) = \big\| v_{i,k} - \Pi_{\mathcal{X}_i^{\omega_{i,k}}}[v_{i,k}] \big\|$, the steps
(7b)-(7c) reduce to

$$x_{i,k} = \Pi_{\mathcal{X}_i^{\omega_{i,k}}}\big[ p_{i,k} - \alpha_k s_{i,k} \big],$$

which is exactly the algorithm in [28].
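For the robust linear inequality case in item 1) above, the violation measure and its subgradient can be sketched in a few lines. This is an illustrative sketch only: the function name is hypothetical, the data are made up, and returning `None` when there is no violation is a convention chosen here (the text instead allows any nonzero vector in that case, since the step is then skipped anyway).

```python
import math


def violation_and_subgradient(A, b, x):
    """Return g^+(x) = ||(Ax - b)^+|| and a subgradient A^T (Ax - b)^+ / g^+(x).

    A is a list of rows, b and x are lists. If g^+(x) == 0, no step is needed
    and None is returned in place of a subgradient (convention of this sketch).
    """
    # Positive part of the residual, one entry per row of A
    r_plus = [max(sum(aij * xj for aij, xj in zip(row, x)) - bi, 0.0)
              for row, bi in zip(A, b)]
    g_plus = math.sqrt(sum(r * r for r in r_plus))
    if g_plus == 0.0:
        return 0.0, None
    d = [sum(A[r][t] * r_plus[r] for r in range(len(A))) / g_plus
         for t in range(len(x))]
    return g_plus, d


# Constraints x1 <= 0, x2 <= 0, -x1 <= 0 evaluated at x = (3, 4).
A = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
b = [0.0, 0.0, 0.0]
g, d = violation_and_subgradient(A, b, [3.0, 4.0])
print(g, d)

# Approximate projection step with stepsize g / ||d||^2.
norm_sq = sum(t * t for t in d)
x_next = [xi - (g / norm_sq) * di for xi, di in zip([3.0, 4.0], d)]
print(x_next)
```

All violated rows contribute to a single direction, which is the point of treating the inequality system through one scalar function rather than row by row.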

B. Assumptions 

For the optimization problem (4), we make the following
assumptions on the set $\mathcal{X}_0$, the objective functions $f_i(x)$ for
$i \in \mathcal{V}$, and the constraint functions $g(x,\omega)$ for $\omega \in \Omega_i$ and
$i \in \mathcal{V}$.

Assumption 2:

(a) The set $\mathcal{X}_0$ is nonempty, closed and convex.

(b) The function $f_i(x)$, for each $i \in \mathcal{V}$, is defined and convex
(not necessarily differentiable) over some open set that
contains $\mathcal{X}_0$.

(c) The subgradients $s \in \partial f_i(x)$ are uniformly bounded
over the set $\mathcal{X}_0$. That is, for all $i \in \mathcal{V}$, there is a scalar
$C_{f_i}$ such that for all $s \in \partial f_i(x)$ and $x \in \mathcal{X}_0$,

$$\|s\| \le C_{f_i}.$$

(d) The function $g(x,\omega)$, for each $\omega \in \Omega_i$ and $i \in \mathcal{V}$, is
defined and convex in $x$ (not necessarily differentiable)
over some open set that contains $\mathcal{X}_0$.

(e) The subgradients $d \in \partial g^+(x,\omega)$ are uniformly bounded
over the set $\mathcal{X}_0$. That is, there is a scalar $C_g$ such that
for all $d \in \partial g^+(x,\omega)$, $x \in \mathcal{X}_0$, $\omega \in \Omega_i$, and $i \in \mathcal{V}$,

$$\|d\| \le C_g.$$

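By Assumption 2, the subdifferentials $\partial f_i(x)$ and $\partial g^+(x,\omega)$
are nonempty over $\mathcal{X}_0$. It also implies that for any $i \in \mathcal{V}$ and
$x, y \in \mathcal{X}_0$,

$$|f_i(x) - f_i(y)| \le C_{f_i} \|x - y\|, \tag{12}$$

and for any $\omega \in \Omega_i$, $i \in \mathcal{V}$, and $x, y \in \mathcal{X}_0$,

$$|g^+(x,\omega) - g^+(y,\omega)| \le C_g \|x - y\|. \tag{13}$$

One sufficient condition for Assumptions 2(c) and 2(e) is that
the set $\mathcal{X}_0$ is compact.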

We also require the following two assumptions. 

Assumption 3: We assume that $\omega_{i,k} \in \Omega_i$ are i.i.d. samples
from some probability distribution on $\Omega_i$ and independent
across agents. Furthermore, each $\Omega_i$ is a finite set and each
element of $\Omega_i$ is generated with nonzero probability, i.e., for
any $\omega \in \Omega_i$ and $i \in \mathcal{V}$

$$\Pr\{\omega \mid \omega \in \Omega_i\} > 0.$$

Assumption 4: For all $i \in \mathcal{V}$, there exists a constant $c > 0$
such that for all $x \in \mathcal{X}_0$

$$\mathrm{dist}^2(x, \mathcal{X}) \le c\, \mathsf{E}\big[ (g^+(x,\omega))^2 \big],$$

where the expectation is taken with respect to the set $\Omega_i$.

The upper bound in Assumption 4 is known as a global error
bound and is crucial for the convergence analysis of our
method. Sufficient conditions for this bound have
been shown in HOl, which require the existence of a
Slater point, i.e., letting $\mathcal{X}_0 = \{x \mid g_0(x) \le 0\}$, there exists a
point $\bar{x}$ such that $g_0(\bar{x}) < 0$ and $g(\bar{x},\omega) < 0$ for all $\omega$. When
each function $g(\cdot,\omega)$ and $g_0(\cdot)$ is either a linear equality or
inequality, Assumption 4 is called linear regularity and can be
shown to hold by using the results in ED and [42] (see also
ES-Ell).

The inter-agent communication relies on the time-varying
graph sequence $\mathcal{G}_k = (\mathcal{V}, \mathcal{E}_k)$, for $k \ge 0$. A key assumption
on these communication graphs is the following:

Assumption 5: There exists a scalar $Q$ such that the graphs
$\big(\mathcal{V}, \bigcup_{\ell = 0, \ldots, Q-1} \mathcal{E}_{k+\ell}\big)$ are strongly connected for all $k \ge 0$.

Assumption 5 ensures that there exists a path from one
agent to every other agent within any bounded interval of
length $Q$. We say that such a sequence of graphs is $Q$-strongly
connected.
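The connectivity condition can be tested on a finite record of communication graphs. The sketch below is only an illustration under stated assumptions: it inspects a finite horizon (the assumption itself concerns all $k \ge 0$), agents are numbered from 0, edges are pairs `(j, i)` for a link from `j` to `i`, and both function names are hypothetical.

```python
def is_strongly_connected(num_agents, edges):
    """True if every agent can reach every other agent along directed edges."""
    def reaches_all(adj):
        seen, stack = {0}, [0]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return len(seen) == num_agents

    fwd = {i: [] for i in range(num_agents)}
    rev = {i: [] for i in range(num_agents)}
    for j, i in edges:
        fwd[j].append(i)
        rev[i].append(j)
    # Agent 0 reaches everyone and everyone reaches agent 0.
    return reaches_all(fwd) and reaches_all(rev)


def is_q_strongly_connected(num_agents, edge_sequence, q):
    """Check every window of q consecutive graphs in a finite edge sequence."""
    windows = range(len(edge_sequence) - q + 1)
    return all(
        is_strongly_connected(
            num_agents, [e for edges in edge_sequence[k:k + q] for e in edges])
        for k in windows
    )


# One link is active per time step; the three links together form a cycle.
sequence = [[(0, 1)], [(1, 2)], [(2, 0)]] * 2
print(is_q_strongly_connected(3, sequence, 3))
print(is_q_strongly_connected(3, sequence, 2))
```

In the example no single graph is connected, yet every window of three consecutive graphs contains the full cycle, which is the situation the assumption is designed to allow.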


C. Main Results 

Our main results establish the correctness of the
algorithm (7a)-(7c) under the assumptions we have laid
out above.

The first proposition states a convergence result which holds
under a $Q$-strongly connected time-varying sequence of graphs
with doubly stochastic matrices $\{W_k\}$.

Proposition 1: Let Assumptions 1-5 and the stepsize
conditions in (8) hold. Then, the iterates $\{x_{i,k}\}$ generated by
each agent $i \in \mathcal{V}$ via DAP in Algorithm 1 converge almost
surely to the same point in the optimal set $\mathcal{X}^*$ of (4), i.e., for
a random point $x^* \in \mathcal{X}^*$

$$\lim_{k \to \infty} x_{i,k} = x^* \quad \text{for all } i \in \mathcal{V} \ \ a.s.$$

The second proposition states a convergence result which
holds under a strongly connected time-varying sequence of graphs
with row stochastic matrices $\{W_k\}$.

Proposition 2: Let Assumptions 2-4 and the stepsize
conditions in (8) hold. Let Assumption 5 hold with $Q = 1$.
Then, the iterates $\{x_{i,k}\}$ generated by each agent $i \in \mathcal{V}$ via
DAP in Algorithm 1 with the choice of weights in (9) converge
almost surely to the same point in the optimal set $\mathcal{X}^*$ of (4),
i.e., for a random point $x^* \in \mathcal{X}^*$

$$\lim_{k \to \infty} x_{i,k} = x^* \quad \text{for all } i \in \mathcal{V} \ \ a.s.$$


IV. Convergence Analysis 

In this section, we prove the convergence results stated in
Propositions 1 and 2. First we
review some lemmas from existing literature that are necessary
in our analysis.


A. Preliminary Results 

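First we state a non-expansiveness property of the projection
operator (see ll4^ for its proof).

Lemma 1: Let $\mathcal{X} \subseteq \mathbb{R}^n$ be a nonempty closed convex set.
The function $\Pi_{\mathcal{X}} : \mathbb{R}^n \to \mathcal{X}$ is nonexpansive, i.e.,

$$\|\Pi_{\mathcal{X}}[x] - \Pi_{\mathcal{X}}[y]\| \le \|x - y\| \quad \text{for all } x, y \in \mathbb{R}^n.$$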

In our analysis of the algorithm, we also make use of the
following convergence result due to Robbins and Siegmund
(see E71 Lemma 10-11, p. 49-50]).

Theorem 1: Let $\{v_k\}$, $\{u_k\}$, $\{a_k\}$ and $\{b_k\}$ be sequences
of non-negative random variables such that

$$\mathsf{E}[v_{k+1} \mid \mathcal{F}_k] \le (1 + a_k) v_k - u_k + b_k \quad \text{for all } k \ge 0 \ \ a.s.,$$

where $\mathcal{F}_k$ denotes the collection $v_0, \ldots, v_k$, $u_0, \ldots, u_k$,
$a_0, \ldots, a_k$ and $b_0, \ldots, b_k$. Also, let $\sum_{k=0}^{\infty} a_k < \infty$ and
$\sum_{k=0}^{\infty} b_k < \infty$ a.s. Then, we have $\lim_{k \to \infty} v_k = v$ for a
random variable $v \ge 0$ a.s., and $\sum_{k=0}^{\infty} u_k < \infty$ a.s.

In the following lemma, we show a relation between $p_{i,k}$ and
$x_{i,k-1}$ associated with any convex function $h$, which will be
used often in the analysis. For example, $h(x) = \|x - a\|^2$ for
some $a \in \mathbb{R}^n$ or $h(x) = \mathrm{dist}^2(x, \mathcal{X})$.

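Lemma 2: Let Assumption 1 hold. Then, for any convex
function $h : \mathbb{R}^n \to \mathbb{R}$, we have

$$\sum_{i \in \mathcal{V}} h(p_{i,k}) \le \sum_{i \in \mathcal{V}} h(x_{i,k-1}).$$

Proof: The double stochasticity of the weights plays a
crucial role in this lemma. From the definition of $p_{i,k}$ in (7a)
and the convexity of $h$,

$$\sum_{i \in \mathcal{V}} h(p_{i,k}) \le \sum_{i \in \mathcal{V}} \sum_{j \in \mathcal{V}} [W_k]_{ij}\, h(x_{j,k-1}) = \sum_{j \in \mathcal{V}} \Big( \sum_{i \in \mathcal{V}} [W_k]_{ij} \Big) h(x_{j,k-1}) = \sum_{j \in \mathcal{V}} h(x_{j,k-1}). \qquad ■$$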
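The inequality of Lemma 2 is easy to check numerically on a small instance. The sketch below uses a hand-picked doubly stochastic matrix and the convex function $h(x) = \|x\|^2$; the data and the helper names are illustrative assumptions, not taken from the text.

```python
def consensus_step(W, xs):
    """Apply (7a) to every agent: p_i = sum_j W[i][j] * x_j."""
    n = len(xs[0])
    return [[sum(W[i][j] * xs[j][t] for j in range(len(xs))) for t in range(n)]
            for i in range(len(xs))]


def sq_norm(x):
    """Convex test function h(x) = ||x||^2."""
    return sum(t * t for t in x)


# Rows and columns of W each sum to one (doubly stochastic).
W = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.25, 0.00, 0.75]]
xs = [[4.0, 0.0], [0.0, 2.0], [-2.0, -2.0]]

ps = consensus_step(W, xs)
before = sum(sq_norm(x) for x in xs)
after = sum(sq_norm(p) for p in ps)
print(before, after)  # the sum of h does not increase after averaging
```

The column sums are what allow the weights to be collected agent by agent in the last step of the proof; with a merely row stochastic matrix the summed inequality can fail.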


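Lastly, for the convergence proof of our algorithm, we use
a result from [48] which shows that the averaged iterates can still
arrive at consensus if the errors behave nicely.

Lemma 3: Let Assumptions 1 and 5 hold. Consider the
iterates generated by

$$\theta_{i,k} = \sum_{j \in \mathcal{V}} [W_k]_{ij}\, \theta_{j,k-1} + e_{i,k} \quad \text{for } i \in \mathcal{V}. \tag{14}$$

Let $\bar{\theta}_k$ denote the average of $\theta_{i,k}$ for $i \in \mathcal{V}$, i.e., $\bar{\theta}_k =
\frac{1}{N} \sum_{i \in \mathcal{V}} \theta_{i,k}$. Suppose there exists a nonnegative nonincreasing
scalar sequence $\{\alpha_k\}$ such that

$$\sum_{k=1}^{\infty} \alpha_k \|e_{i,k}\| < \infty \quad \text{for all } i \in \mathcal{V}.$$

Then, for all $i, j \in \mathcal{V}$

$$\sum_{k=1}^{\infty} \alpha_k \|\theta_{i,k} - \theta_{j,k}\| < \infty.$$

Furthermore, for all $i \in \mathcal{V}$ and $k \ge 1$,

$$\|\theta_{i,k} - \bar{\theta}_k\| \le N \gamma \beta^{k} \max_{j} \|\theta_{j,0}\| + \gamma \sum_{\ell=0}^{k-1} \beta^{k-\ell} \sum_{j=1}^{N} \|e_{j,\ell+1}\| + \frac{1}{N} \sum_{j=1}^{N} \|e_{j,k}\| + \|e_{i,k}\|,$$

where $\gamma$ and $\beta$, with $0 < \beta < 1$, are constants defined as in [48].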

B. Lemmas 

We need a series of lemmas for proving Propositions 1 and
2. We first state an auxiliary lemma that will be later used to
relate two consecutive iterates $x_{i,k}$ and $x_{i,k-1}$. This lemma
can be shown by combining two existing results in B9l and
lEol, but we include it here for completeness.

Lemma 4: Let Assumptions 2 and 5 hold. Let the iterates
$\{p_{i,k}\}$, $\{v_{i,k}\}$ and $\{x_{i,k}\}$ be generated by the algorithm (7a)-(7c).
Then, we have almost surely for any $x, z \in \mathcal{X}_0$, $i \in \mathcal{V}$
and $k \ge 1$,

$$\|x_{i,k} - x\|^2 \le \|p_{i,k} - x\|^2 - 2\alpha_k \big( f_i(z) - f_i(x) \big) - \frac{\tau - 1}{\tau C_g^2} \big( g^+(p_{i,k}, \omega_{i,k}) \big)^2 + \frac{1}{4\eta} \|p_{i,k} - z\|^2 + D_{\tau,\eta}\, \alpha_k^2,$$

where $D_{\tau,\eta} = (\tau + 4\eta + 1) C_{f_i}^2$ and $\eta, \tau > 0$ are arbitrary.

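Proof: In the light of ll4^ Theorem 1], we obtain from
algorithm (7c) and Assumption 2(e)

$$\|x_{i,k} - x\|^2 \le \|v_{i,k} - x\|^2 - \frac{\big( g^+(v_{i,k}, \omega_{i,k}) \big)^2}{C_g^2} \tag{15}$$

for any $x \in \mathcal{X}_0$. We can rewrite $g^+(v_{i,k}, \omega_{i,k}) =
\big( g^+(v_{i,k}, \omega_{i,k}) - g^+(p_{i,k}, \omega_{i,k}) \big) + g^+(p_{i,k}, \omega_{i,k})$. Therefore,

$$\big( g^+(v_{i,k}, \omega_{i,k}) \big)^2 \ge 2 g^+(p_{i,k}, \omega_{i,k}) \big( g^+(v_{i,k}, \omega_{i,k}) - g^+(p_{i,k}, \omega_{i,k}) \big) + \big( g^+(p_{i,k}, \omega_{i,k}) \big)^2. \tag{16}$$

The first term on the right-hand side of (16) can be further
estimated as

$$2 g^+(p_{i,k}, \omega_{i,k}) \big( g^+(v_{i,k}, \omega_{i,k}) - g^+(p_{i,k}, \omega_{i,k}) \big) \ge -2 g^+(p_{i,k}, \omega_{i,k}) \big| g^+(v_{i,k}, \omega_{i,k}) - g^+(p_{i,k}, \omega_{i,k}) \big| \ge -2 C_g \|v_{i,k} - p_{i,k}\|\, g^+(p_{i,k}, \omega_{i,k}), \tag{17}$$

where the last inequality is from relation (13). From the
definition in (7b) and Assumption 2(c), we further have that

$$2 C_g \|v_{i,k} - p_{i,k}\|\, g^+(p_{i,k}, \omega_{i,k}) \le 2 \alpha_k C_g C_{f_i}\, g^+(p_{i,k}, \omega_{i,k}) \le \tau \alpha_k^2 C_g^2 C_{f_i}^2 + \frac{1}{\tau} \big( g^+(p_{i,k}, \omega_{i,k}) \big)^2, \tag{18}$$

where the last inequality is obtained by using $2|a||b| \le \tau a^2 + \frac{1}{\tau} b^2$
and $\tau > 0$ is arbitrary. Using relations (17)-(18) in (16),
we obtain

$$\big( g^+(v_{i,k}, \omega_{i,k}) \big)^2 \ge -\tau \alpha_k^2 C_g^2 C_{f_i}^2 + \Big( 1 - \frac{1}{\tau} \Big) \big( g^+(p_{i,k}, \omega_{i,k}) \big)^2.$$

Hence, for all $x \in \mathcal{X}_0$,

$$\|x_{i,k} - x\|^2 \le \|v_{i,k} - x\|^2 - \frac{\tau - 1}{\tau C_g^2} \big( g^+(p_{i,k}, \omega_{i,k}) \big)^2 + \tau C_{f_i}^2 \alpha_k^2. \tag{19}$$

As the update rule in (7b) coincides with the algorithm in [30],
we can reuse another existing lemma [30, Lemma 3]. That is,
for any $x, z \in \mathcal{X}_0$, we have

$$\|v_{i,k} - x\|^2 \le \|p_{i,k} - x\|^2 - 2\alpha_k \big( f_i(z) - f_i(x) \big) + \frac{1}{4\eta} \|p_{i,k} - z\|^2 + \alpha_k^2 (1 + 4\eta) C_{f_i}^2, \tag{20}$$

where $\eta > 0$ is arbitrary. Substituting this inequality in relation
(19) concludes the proof. ■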

Since we use an approximate projection, we cannot guarantee
the feasibility of the iterates $\{x_{i,k}\}$ and $\{p_{i,k}\}$. In the
next lemma, we prove that $\{p_{i,k}\}$ and $\{x_{i,k}\}$ for all $i \in \mathcal{V}$
asymptotically achieve feasibility. To this end, we define the
following quantity: For all $i \in \mathcal{V}$ and $k \ge 1$, $z_{i,k}$ is defined as
the projection of $p_{i,k}$ on the feasible set $\mathcal{X}$, i.e.,

$$z_{i,k} = \Pi_{\mathcal{X}}[p_{i,k}]. \tag{21}$$


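Lemma 5: Let Assumptions □ -El hold. Let the sequence
$\{\alpha_k\}$ be such that $\sum_{k=1}^{\infty} \alpha_k^2 < \infty$. Then, the iterates $\{p_{i,k}\}$
and $\{x_{i,k}\}$ generated by each agent $i \in \mathcal{V}$ via method (7a)-(7c)
satisfy:

(a) $\sum_{k=1}^{\infty} \mathrm{dist}^2(p_{i,k}, \mathcal{X}) < \infty$ a.s.

(b) $\sum_{k=1}^{\infty} \|x_{i,k} - z_{i,k}\|^2 < \infty$ a.s.

where $z_{i,k}$ is defined in (21).

Proof: We use Lemma 4 with $x = z = z_{i,k}$. Therefore,
for any $i \in \mathcal{V}$ and $k \ge 1$, we obtain almost surely

$$\|x_{i,k} - z_{i,k}\|^2 \le \|p_{i,k} - z_{i,k}\|^2 - \frac{\tau - 1}{\tau C_g^2} \big( g^+(p_{i,k}, \omega_{i,k}) \big)^2 + \frac{1}{4\eta} \|p_{i,k} - z_{i,k}\|^2 + D_{\tau,\eta}\, \alpha_k^2, \tag{22}$$

where $D_{\tau,\eta} = (\tau + 4\eta + 1) C_{f_i}^2$ and $\eta, \tau > 0$ are arbitrary. By
the definition of the projection, we have

$$\mathrm{dist}(p_{i,k}, \mathcal{X}) = \|p_{i,k} - z_{i,k}\|, \quad \text{and} \quad \mathrm{dist}(x_{i,k}, \mathcal{X}) = \|x_{i,k} - \Pi_{\mathcal{X}}[x_{i,k}]\| \le \|x_{i,k} - z_{i,k}\|.$$

Upon substituting these estimates in relation (22), we obtain

$$\mathrm{dist}^2(x_{i,k}, \mathcal{X}) \le \mathrm{dist}^2(p_{i,k}, \mathcal{X}) - \frac{\tau - 1}{\tau C_g^2} \big( g^+(p_{i,k}, \omega_{i,k}) \big)^2 + \frac{1}{4\eta}\, \mathrm{dist}^2(p_{i,k}, \mathcal{X}) + D_{\tau,\eta}\, \alpha_k^2. \tag{23}$$

Let $\mathcal{F}_k$ denote the algorithm's history up to time $k$. Taking
the expectation conditioned on $\mathcal{F}_{k-1}$ and noting that $p_{i,k}$ is
fully determined by $\mathcal{F}_{k-1}$, we have almost surely for any $i \in \mathcal{V}$
and $k \ge 1$

$$\mathsf{E}\big[ \mathrm{dist}^2(x_{i,k}, \mathcal{X}) \mid \mathcal{F}_{k-1} \big] \le \mathrm{dist}^2(p_{i,k}, \mathcal{X}) - \frac{\tau - 1}{\tau C_g^2}\, \mathsf{E}\big[ \big( g^+(p_{i,k}, \omega_{i,k}) \big)^2 \mid \mathcal{F}_{k-1} \big] + \frac{1}{4\eta}\, \mathrm{dist}^2(p_{i,k}, \mathcal{X}) + D_{\tau,\eta}\, \alpha_k^2. \tag{24}$$

Furthermore, choosing $\tau = 4$, $\eta = c C_g^2$ and using Assumption
4 yield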

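$$\mathsf{E}\big[ \mathrm{dist}^2(x_{i,k}, \mathcal{X}) \mid \mathcal{F}_{k-1} \big] \le \mathrm{dist}^2(p_{i,k}, \mathcal{X}) - \frac{1}{2 c C_g^2}\, \mathrm{dist}^2(p_{i,k}, \mathcal{X}) + (5 + 4 c C_g^2) C_{f_i}^2 \alpha_k^2. \tag{25}$$

Finally, by summing over all $i$ and using Lemma 2 with
$h(x) = \mathrm{dist}^2(x, \mathcal{X})$, we arrive at the following relation:

$$\sum_{i \in \mathcal{V}} \mathsf{E}\big[ \mathrm{dist}^2(x_{i,k}, \mathcal{X}) \mid \mathcal{F}_{k-1} \big] \le \sum_{i \in \mathcal{V}} \mathrm{dist}^2(x_{i,k-1}, \mathcal{X}) - \frac{1}{2 c C_g^2} \sum_{i \in \mathcal{V}} \mathrm{dist}^2(p_{i,k}, \mathcal{X}) + D N \alpha_k^2, \tag{26}$$

where $D = (5 + 4 c C_g^2) C_f^2$ and $C_f = \max_{i \in \mathcal{V}} C_{f_i}$. Therefore,
for all $k \ge 1$, all the conditions of the convergence theorem
(Theorem 1) are satisfied and we conclude that

$$\sum_{k=1}^{\infty} \mathrm{dist}^2(p_{i,k}, \mathcal{X}) < \infty \quad \text{for all } i \in \mathcal{V} \ \ a.s. \tag{27}$$

Lastly, from relation (22) and the chosen values for $\tau$ and $\eta$,
we obtain for any $i \in \mathcal{V}$ and $k \ge 1$ almost surely

$$\|x_{i,k} - z_{i,k}\|^2 \le \Big( 1 + \frac{1}{4 c C_g^2} \Big) \mathrm{dist}^2(p_{i,k}, \mathcal{X}) + D \alpha_k^2.$$

Therefore, in view of the result in (27) and $\sum_{k=1}^{\infty} \alpha_k^2 < \infty$,
the relation above implies

$$\sum_{k=1}^{\infty} \|x_{i,k} - z_{i,k}\|^2 < \infty \quad \text{for all } i \in \mathcal{V} \ \ a.s.,$$

which is our desired result. ■

To complete the proof, we show in part (a) of the next
lemma that the error due to the perturbations made after
the consensus step (7a), i.e.,

$$e_{i,k} = x_{i,k} - p_{i,k}, \tag{28}$$

eventually converges to zero for all $i \in \mathcal{V}$. This will allow us
to invoke Lemma 3 and show the iterate consensus. In part (b)
of the next lemma, we show that the sequences $\{z_{i,k}\}$ arrive
at consensus by converging to their mean $\bar{z}_k$, i.e., for $k \ge 1$

$$\bar{z}_k = \frac{1}{N} \sum_{i \in \mathcal{V}} z_{i,k}. \tag{29}$$

In part (c) of the next lemma, we show the network error term
is summable.

Lemma 6: Let Assumptions [ZE hold. Let the sequence
$\{\alpha_k\}$ be nonnegative and nonincreasing with $\sum_{k=1}^{\infty} \alpha_k^2 < \infty$. Then,
we have for all $i \in \mathcal{V}$

(a) $\sum_{k=1}^{\infty} \|e_{i,k}\|^2 < \infty$ a.s.

(b) $\lim_{k \to \infty} \|z_{i,k} - \bar{z}_k\| = 0$ a.s.

(c) $\sum_{k=1}^{\infty} \alpha_k \|z_{i,k} - \bar{z}_k\| < \infty$ a.s.

Proof: Part (a): From the relations (7a)-(7c), $e_{i,k}$ in (28)
can be viewed as the perturbation that we make on $p_{i,k}$ after
the network consensus step (7a). Consider $\|e_{i,k}\|$, for which
we can write

$$\|e_{i,k}\| \le \|x_{i,k} - z_{i,k}\| + \|z_{i,k} - p_{i,k}\|.$$

Applying $(a + b)^2 \le 2a^2 + 2b^2$ in the above inequality, we
have

$$\|e_{i,k}\|^2 \le 2\|x_{i,k} - z_{i,k}\|^2 + 2\, \mathrm{dist}^2(p_{i,k}, \mathcal{X}).$$

Summing this over $k$ and using Lemma 5 we obtain the
desired result.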

Part (b): By applying the inequality 2ab < a^ + b^ to each 
term in Q:fc||ei^fc|| and using Lemma| 6 ja), we further obtain for 
alH G V 

OO ^ OO ^ OO 

y]afc||e 2 ,fe|| < 2 +-y] ||ei,fef < oo a.s. (30) 

fc=i fc=i fc=i 

Using the relation above, (G^ and Xi^k = Pi,k + Si^k, we can 
invoke Lemma |3] with 0i^k = Xi^k- Therefore, it follows that 

OO 

^^ak\\xi^k - Xj^kW < OO for all z, j € V a.s. (31) 

k^l 

Furthermore, for all * G V and fc > 1, 

||a: 2 ,fc -Xk\\ < iV7/3''max||a;j,o|| (32) 

3 

k-1 N N 

+ tX] ^ X] l|ej,^+i|| + + I|e 2 ,fc||. 

1=0 j=l j=l 

From the fact that $0 < \beta < 1$ and part (a), we know the first
term and the last two terms on the right-hand side converge
to zero. To show the second term also converges to zero, we
will use the following result from [50, Lemma 3.1(a)].

Lemma 7: Let $\{\zeta_k\}$ be a scalar sequence. If $\lim_{k\to\infty}\zeta_k = \zeta$
and $0 < \beta < 1$, then $\lim_{k\to\infty}\sum_{\ell=0}^{k}\beta^{k-\ell}\zeta_\ell = \zeta/(1-\beta)$.
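As an informal numerical check of Lemma 7 (a sketch, not part of the proof), the following Python snippet evaluates $\sum_{\ell=0}^{k}\beta^{k-\ell}\zeta_\ell$ for a hypothetical sequence $\zeta_k = \zeta + 1/(k+1)$ and compares it with $\zeta/(1-\beta)$. The function name `discounted_sum` and the chosen values of $\beta$ and $\zeta$ are illustrative assumptions. The second call uses a sequence that vanishes in the limit, which is the case relevant to the error terms in the proof.

```python
def discounted_sum(zeta, beta):
    """Return sum_{l=0}^{k} beta**(k - l) * zeta[l], where k = len(zeta) - 1."""
    total = 0.0
    for value in zeta:
        # Recursion S_k = beta * S_{k-1} + zeta_k reproduces the weighted sum.
        total = beta * total + value
    return total


beta = 0.5    # hypothetical contraction factor, 0 < beta < 1
limit = 2.0   # hypothetical limit of the scalar sequence

# Sequence converging to `limit`: the sum should approach limit / (1 - beta).
shifted = discounted_sum([limit + 1.0 / (k + 1) for k in range(5000)], beta)

# Sequence converging to zero: the sum should approach zero as well.
vanishing = discounted_sum([1.0 / (k + 1) for k in range(5000)], beta)

print(shifted, limit / (1.0 - beta))
print(vanishing)
```

Both printed sums should be close to their predicted limits; the gap shrinks as more terms are used.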

From this lemma and the result in part (a), we know that the
second term on the right-hand side of (32) also converges to
zero. Therefore, we have for all $i \in \mathcal{V}$

$$\lim_{k\to\infty}\|x_{i,k} - \bar x_k\| = 0. \tag{33}$$

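We next consider the term $\|z_{i,k} - \bar z_k\|$, for which by using
$\bar z_k = \frac{1}{N}\sum_{\ell\in\mathcal{V}} z_{\ell,k}$ we have

$$\|z_{i,k} - \bar z_k\| = \Big\|\frac{1}{N}\sum_{\ell\in\mathcal{V}}(z_{i,k} - z_{\ell,k})\Big\| \le \frac{1}{N}\sum_{\ell\in\mathcal{V}}\|z_{i,k} - z_{\ell,k}\| \le \frac{1}{N}\sum_{\ell\in\mathcal{V}}\|p_{i,k} - p_{\ell,k}\|,$$

where the first inequality is obtained by the convexity of the
norm and the last inequality follows by the non-expansive
projection property in Lemma 1. Furthermore, by using
$\|p_{i,k} - p_{\ell,k}\| \le \|p_{i,k} - \bar p_k\| + \|p_{\ell,k} - \bar p_k\|$, we obtain for every $i \in \mathcal{V}$

$$\|z_{i,k} - \bar z_k\| \le \|p_{i,k} - \bar p_k\| + \frac{1}{N}\sum_{\ell\in\mathcal{V}}\|p_{\ell,k} - \bar p_k\|. \tag{34}$$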


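We next consider $\|p_{i,k} - \bar p_k\|$. By using the convexity of the
norm and the fact that $0 \le [W_k]_{ij} \le 1$, we obtain

$$\|p_{i,k} - \bar p_k\| = \Big\|\sum_{j\in\mathcal{V}}[W_k]_{ij}\,x_{j,k-1} - \bar p_k\Big\| \le \sum_{j\in\mathcal{V}}\|x_{j,k-1} - \bar p_k\| = \sum_{j\in\mathcal{V}}\Big\|x_{j,k-1} - \frac{1}{N}\sum_{\ell\in\mathcal{V}} x_{\ell,k-1}\Big\|,$$

where in the last equality we use $\bar p_k = \frac{1}{N}\sum_{\ell\in\mathcal{V}} p_{\ell,k} = \frac{1}{N}\sum_{\ell\in\mathcal{V}} x_{\ell,k-1}$. Therefore,
by using the convexity of the norm again, we see

$$\|p_{i,k} - \bar p_k\| \le \frac{1}{N}\sum_{j\in\mathcal{V}}\sum_{\ell\in\mathcal{V}}\|x_{j,k-1} - x_{\ell,k-1}\| \tag{35}$$

$$\le \frac{1}{N}\sum_{j\in\mathcal{V}}\sum_{\ell\in\mathcal{V}}\big(\|x_{j,k-1} - \bar x_{k-1}\| + \|x_{\ell,k-1} - \bar x_{k-1}\|\big).$$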

Combining this relation with (34) and using the result in (33),
we obtain the desired result.
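The identity $\bar p_k = \bar x_{k-1}$ used in this part relies on the weight matrix being doubly stochastic. A minimal Python sketch, using a hypothetical 3-agent matrix `W` and scalar estimates (both are illustrative assumptions, not data from the paper), shows that one consensus step preserves the network average while shrinking the disagreement among agents:

```python
def consensus_step(W, x):
    """Return p with p[i] = sum_j W[i][j] * x[j] for scalar estimates x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]


def mean(values):
    return sum(values) / len(values)


def disagreement(values):
    """Sum over agents of the distance to the network average."""
    m = mean(values)
    return sum(abs(v - m) for v in values)


# Hypothetical doubly stochastic weights: every row and column sums to one.
W = [[0.50, 0.25, 0.25],
     [0.25, 0.50, 0.25],
     [0.25, 0.25, 0.50]]

x_prev = [3.0, 0.0, 6.0]          # hypothetical estimates x_{i,k-1}
p = consensus_step(W, x_prev)     # consensus outputs p_{i,k}

print(mean(x_prev), mean(p))                   # the two averages coincide
print(disagreement(x_prev), disagreement(p))   # the disagreement decreases
```

With a merely row stochastic matrix the first property generally fails, which is why the proof of Proposition 2 works with weighted averages instead.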

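Part (c): By using relation (31) in (35), we obtain

$$\sum_{k=1}^{\infty}\alpha_k\|p_{i,k} - \bar p_k\| \le \frac{1}{N}\sum_{k=1}^{\infty}\sum_{j\in\mathcal{V}}\sum_{\ell\in\mathcal{V}}\alpha_k\|x_{j,k-1} - x_{\ell,k-1}\| < \infty. \tag{36}$$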

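Upon summing the relation (34) over $i \in \mathcal{V}$, we find

$$\sum_{i\in\mathcal{V}}\|z_{i,k} - \bar z_k\| \le 2\sum_{i\in\mathcal{V}}\|p_{i,k} - \bar p_k\|. \tag{37}$$

Therefore, from (36) and (37), we obtain

$$\sum_{i\in\mathcal{V}}\sum_{k=1}^{\infty}\alpha_k\|z_{i,k} - \bar z_k\| \le 2\sum_{i\in\mathcal{V}}\sum_{k=1}^{\infty}\alpha_k\|p_{i,k} - \bar p_k\| < \infty,$$

which is the desired result. ■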

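In the next lemma, we use standard convexity analysis to
lower-bound the term $\sum_{i\in\mathcal{V}}(f_i(z_{i,k}) - f_i(\check x))$ with a network
error term and a global term.

Lemma 8: Let Assumption 2 hold. Then, for all $\check x \in \mathcal{X}$, we
have

$$\sum_{i\in\mathcal{V}}\big(f_i(z_{i,k}) - f_i(\check x)\big) \ge -C_f\sum_{i\in\mathcal{V}}\|z_{i,k} - \bar z_k\| + f(\bar z_k) - f(\check x),$$

where $C_f = \max_{i\in\mathcal{V}} C_{f_i}$.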

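Proof: Recall that $f(x) = \sum_{i\in\mathcal{V}} f_i(x)$. Recall that
$\bar z_k = \frac{1}{N}\sum_{i\in\mathcal{V}} z_{i,k}$. Using $\bar z_k$ and $f$, we can rewrite the term
$\sum_{i\in\mathcal{V}}(f_i(z_{i,k}) - f_i(\check x))$ as follows:

$$\sum_{i\in\mathcal{V}}\big(f_i(z_{i,k}) - f_i(\check x)\big) = \sum_{i\in\mathcal{V}}\big(f_i(z_{i,k}) - f_i(\bar z_k)\big) + \big(f(\bar z_k) - f(\check x)\big). \tag{38}$$

Furthermore, using the convexity of each function $f_i$, we
obtain

$$\sum_{i\in\mathcal{V}}\big(f_i(z_{i,k}) - f_i(\bar z_k)\big) \ge \sum_{i\in\mathcal{V}}\langle s_{i,k},\, z_{i,k} - \bar z_k\rangle \ge -\sum_{i\in\mathcal{V}}\|s_{i,k}\|\,\|z_{i,k} - \bar z_k\|,$$

where $s_{i,k}$ is a subgradient of $f_i$ at $\bar z_k$. Since $\bar z_k$ is a convex
combination of points $z_{i,k} \in \mathcal{X} \subset \mathcal{X}_0$, it follows that $\bar z_k \in \mathcal{X}_0$.
This observation and Assumption 2(c), stating that the
subgradients of $f_i(x)$ are uniformly bounded for $x \in \mathcal{X}_0$, yield

$$\sum_{i\in\mathcal{V}}\big(f_i(z_{i,k}) - f_i(\bar z_k)\big) \ge -C_f\sum_{i\in\mathcal{V}}\|z_{i,k} - \bar z_k\|, \tag{39}$$

where $C_f = \max_{i\in\mathcal{V}} C_{f_i}$. Therefore, from (38) and (39), we
have that

$$\sum_{i\in\mathcal{V}}\big(f_i(z_{i,k}) - f_i(\check x)\big) \ge -C_f\sum_{i\in\mathcal{V}}\|z_{i,k} - \bar z_k\| + f(\bar z_k) - f(\check x).$$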


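C. Proof of Proposition 1

We invoke Lemma 4 with $z = z_{i,k} = \Pi_{\mathcal{X}}[p_{i,k}]$, $\tau = 4$
and $\eta = cC_g^2$. We also let $\check x = x^*$ for an arbitrary $x^* \in \mathcal{X}^*$.
Therefore, for any $x^* \in \mathcal{X}^*$, $i \in \mathcal{V}$ and $k \ge 1$, we almost
surely have

$$\|x_{i,k} - x^*\|^2 \le \|p_{i,k} - x^*\|^2 - 2\alpha_k\big(f_i(z_{i,k}) - f_i(x^*)\big) - \frac{3}{4C_g^2}\big(g^+(p_{i,k},\omega_{i,k})\big)^2 + \frac{1}{4cC_g^2}\mathrm{dist}^2(p_{i,k},\mathcal{X}) + (5 + 4cC_g^2)C_f^2\alpha_k^2. \tag{40}$$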

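Let $\mathcal{F}_k$ denote the algorithm's history up to time $k$, i.e.,

$$\mathcal{F}_k = \{x_{i,0},\ (\omega_{i,\ell},\ 1 \le \ell \le k),\ i \in \mathcal{V}\},$$

and $\mathcal{F}_0 = \{x_{i,0},\ i \in \mathcal{V}\}$. By taking the expectation conditioned
on $\mathcal{F}_{k-1}$ in the above relation and summing this over $i \in \mathcal{V}$,
we obtain

$$\sum_{i\in\mathcal{V}}\mathsf{E}\big[\|x_{i,k} - x^*\|^2 \mid \mathcal{F}_{k-1}\big] \le \sum_{i\in\mathcal{V}}\|p_{i,k} - x^*\|^2 - 2\alpha_k\sum_{i\in\mathcal{V}}\big(f_i(z_{i,k}) - f_i(x^*)\big) - \frac{3}{4C_g^2}\sum_{i\in\mathcal{V}}\mathsf{E}\big[\big(g^+(p_{i,k},\omega_{i,k})\big)^2 \mid \mathcal{F}_{k-1}\big] + \frac{1}{4cC_g^2}\sum_{i\in\mathcal{V}}\mathrm{dist}^2(p_{i,k},\mathcal{X}) + DN\alpha_k^2,$$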

where $D = (5 + 4cC_g^2)C_f^2$ with $C_f = \max_{i\in\mathcal{V}} C_{f_i}$. Now we
use Lemma 2 with $h(x) = \|x - x^*\|^2$, Assumption 4 and
Lemma 8 with $\check x = x^*$ to further estimate the terms on the
right-hand side. From these, we obtain almost surely for any $k \ge 1$
and $x^* \in \mathcal{X}^*$,

$$\sum_{i\in\mathcal{V}}\mathsf{E}\big[\|x_{i,k} - x^*\|^2 \mid \mathcal{F}_{k-1}\big] \le \sum_{i\in\mathcal{V}}\|x_{i,k-1} - x^*\|^2 - 2\alpha_k\big(f(\bar z_k) - f(x^*)\big) - \frac{1}{2cC_g^2}\sum_{i\in\mathcal{V}}\mathrm{dist}^2(p_{i,k},\mathcal{X}) + 2\alpha_k C_f\sum_{i\in\mathcal{V}}\|z_{i,k} - \bar z_k\| + DN\alpha_k^2.$$


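Since $\bar z_k \in \mathcal{X}$, we have $f(\bar z_k) - f(x^*) \ge 0$. Thus, under the
assumption $\sum_{k=0}^{\infty}\alpha_k^2 < \infty$ and Lemma 6(c), the above relation
satisfies all the conditions of the convergence theorem (Theorem 1).
Using this theorem, we have the following results.

Result 1: The sequence $\{\sum_{i\in\mathcal{V}}\|x_{i,k} - x^*\|^2\}$ is convergent
a.s. for every $x^* \in \mathcal{X}^*$.

Result 2: For every $x^* \in \mathcal{X}^*$,

$$\sum_{k=1}^{\infty}\alpha_k\big(f(\bar z_k) - f(x^*)\big) < \infty \quad \text{a.s.}$$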


From Result 1 and Lemma 5(b), we know that the sequence
$\{\sum_{i\in\mathcal{V}}\|z_{i,k} - x^*\|^2\}$ is also convergent a.s. for every $x^* \in \mathcal{X}^*$. This
and Lemma 6(b) imply that $\|\bar z_k - x^*\|$ is also convergent a.s.
for every $x^* \in \mathcal{X}^*$. From Result 2, $\sum_{k=0}^{\infty}\alpha_k = \infty$, and the
continuity of $f$, it follows that the sequence $\{\bar z_k\}$ must have
one accumulation point in the set $\mathcal{X}^*$ a.s. This and the fact
that $\{\|\bar z_k - x^*\|\}$ is convergent a.s. for every $x^* \in \mathcal{X}^*$ imply
that for a random point $x^\star \in \mathcal{X}^*$,

$$\lim_{k\to\infty}\bar z_k = x^\star \quad \text{a.s.} \tag{41}$$


We now prove the following claim: for all $i \in \mathcal{V}$,

$$\lim_{k\to\infty} x_{i,k} = x^\star \quad \text{a.s.} \tag{42}$$

Consider

$$\|x_{i,k} - x^\star\| \le \|x_{i,k} - z_{i,k}\| + \|z_{i,k} - \bar z_k\| + \|\bar z_k - x^\star\|.$$

From Lemma 5(b), Lemma 6(b) and (41), all the terms on the
right-hand side converge to zero a.s. Therefore, claim (42)
holds, which is our desired result.


D. Proof of Proposition 2

The line of proof is similar to that of Proposition 1.
Therefore, we only lay down the differences.

Note that the use of row stochastic matrices in (9) results
in “biased” consensus, which is related to the left-eigenvector, 
see e.g., ED. The following lemma states this well-known 
result. 

Lemma 9: Let Assumption 5 hold with $Q = 1$. Then, for
any $k \ge 1$, there exists a normalized left-eigenvector $\pi_k \in \mathbb{R}^N$
such that

$$\pi_k^{\top} W_k = \pi_k^{\top}.$$
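To make the role of the left-eigenvector concrete, here is a small Python sketch that approximates $\pi$ for a single fixed row stochastic matrix by power iteration. The matrix `W`, the function name `left_eigenvector` and the iteration count are illustrative assumptions; the sketch also assumes positive diagonal entries so that the iteration converges, and it does not cover the time-varying sequence $\{W_k\}$ treated in the proof.

```python
def left_eigenvector(W, iterations=200):
    """Approximate pi with pi^T W = pi^T and sum(pi) = 1 by power iteration."""
    n = len(W)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        # One step pi^T <- pi^T W; row stochasticity keeps sum(pi) equal to 1.
        pi = [sum(pi[i] * W[i][j] for i in range(n)) for j in range(n)]
    return pi


# Hypothetical row stochastic (but not doubly stochastic) weight matrix.
W = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]

pi = left_eigenvector(W)

# Largest violation of the fixed-point equation pi^T W = pi^T.
residual = max(abs(sum(pi[i] * W[i][j] for i in range(3)) - pi[j])
               for j in range(3))

print(pi, residual)
```

For this matrix the weights are not uniform, so the agents agree on a weighted rather than a plain average of their initial values, which is the "biased" consensus mentioned above.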


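Also, in the proof we consider the following weighted averages
rather than the true averages $\bar x_k$, $\bar p_k$ and $\bar z_k$:

$$\hat x_k = \sum_{i\in\mathcal{V}}[\pi_k]_i\,x_{i,k}, \qquad \hat p_k = \sum_{i\in\mathcal{V}}[\pi_k]_i\,p_{i,k}, \qquad \text{and} \qquad \hat z_k = \sum_{i\in\mathcal{V}}[\pi_k]_i\,z_{i,k}. \tag{44}$$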

First, notice that Lemma 4 still holds in this case as it does
not require Assumption [D 

Changes in Lemma |5} Combining with the results in ISl . 
HD, ifS^ - ESl . Lemma ID still holds in this case by replacing 
$\bar\theta_k$ with $\hat\theta_k = \sum_{i\in\mathcal{V}}[\pi_k]_i\,\theta_{i,k}$ and re-defining the constants $\gamma$
and $\beta$ as



If in addition every graph in $\{G_k\}$ is regular, then we have

$$\gamma = \sqrt{2}, \qquad \beta = \min\Big\{1 - \frac{1}{4N^3},\ \max_{k}\sigma_2(W_k)\Big\},$$

where $\sigma_2(W_k)$ is the second largest singular value of $W_k$.

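Changes in Lemma 5: By multiplying $[\pi_k]_i$ to (25) and
summing over $i \in \mathcal{V}$, we obtain

$$\sum_{i\in\mathcal{V}}\mathsf{E}\big[[\pi_k]_i\,\mathrm{dist}^2(x_{i,k},\mathcal{X}) \mid \mathcal{F}_{k-1}\big] \le \sum_{i\in\mathcal{V}}[\pi_k]_i\,\mathrm{dist}^2(p_{i,k},\mathcal{X}) - \frac{1}{2cC_g^2}\sum_{i\in\mathcal{V}}[\pi_k]_i\,\mathrm{dist}^2(p_{i,k},\mathcal{X}) + DN\alpha_k^2. \tag{45}$$

From the definition of $p_{i,k}$ in (9a) and the convexity of the
distance function, we have

$$\sum_{i\in\mathcal{V}}[\pi_k]_i\,\mathrm{dist}^2(p_{i,k},\mathcal{X}) \le \sum_{i\in\mathcal{V}}\sum_{j\in\mathcal{V}}[\pi_k]_i[W_k]_{ij}\,\mathrm{dist}^2(x_{j,k-1},\mathcal{X}) = \sum_{j\in\mathcal{V}}[\pi_k]_j\,\mathrm{dist}^2(x_{j,k-1},\mathcal{X}),$$

where the last equality follows from (43). Combining this
result with (45), we obtain

$$\sum_{i\in\mathcal{V}}\mathsf{E}\big[[\pi_k]_i\,\mathrm{dist}^2(x_{i,k},\mathcal{X}) \mid \mathcal{F}_{k-1}\big] \le \sum_{i\in\mathcal{V}}[\pi_k]_i\,\mathrm{dist}^2(x_{i,k-1},\mathcal{X}) - \frac{1}{2cC_g^2}\sum_{i\in\mathcal{V}}[\pi_k]_i\,\mathrm{dist}^2(p_{i,k},\mathcal{X}) + DN\alpha_k^2,$$

in which all the conditions of Theorem 1 hold. Hence, all
the remaining results follow immediately.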

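Changes in Lemma 6: All the results still hold by replacing
$\bar z_k$, $\bar p_k$ and $\bar x_k$ with $\hat z_k$, $\hat p_k$ and $\hat x_k$, respectively.
Here, $[\pi_k]_i > 0$ for all $i \in \mathcal{V}$ and, using the definition of $\pi_k$
in Lemma 9, we have

$$[\pi_k]_j = \sum_{i\in\mathcal{V}}[\pi_k]_i[W_k]_{ij}, \quad \text{for all } j \in \mathcal{V}. \tag{43}$$

Especially, from relation (43) we have

$$\hat p_k = \sum_{i\in\mathcal{V}}[\pi_k]_i\sum_{j\in\mathcal{V}}[W_k]_{ij}\,x_{j,k-1} = \sum_{j\in\mathcal{V}}[\pi_k]_j\,x_{j,k-1} = \hat x_{k-1}$$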

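and all the results follow immediately.

Changes in Proposition 1: By multiplying $[\pi_k]_i$ to (40),
summing this over $i \in \mathcal{V}$ and considering $f_i(x) = c^{\top}x$,
we have

$$\sum_{i\in\mathcal{V}}[\pi_k]_i\|x_{i,k} - x^*\|^2 \le \sum_{i\in\mathcal{V}}[\pi_k]_i\|p_{i,k} - x^*\|^2 - 2\alpha_k c^{\top}(\hat z_k - x^*) - \frac{3}{4C_g^2}\sum_{i\in\mathcal{V}}[\pi_k]_i\big(g^+(p_{i,k},\omega_{i,k})\big)^2 + \frac{1}{4cC_g^2}\sum_{i\in\mathcal{V}}[\pi_k]_i\,\mathrm{dist}^2(p_{i,k},\mathcal{X}) + DN\alpha_k^2,$$

where we used the fact that $\sum_{i\in\mathcal{V}}[\pi_k]_i\,z_{i,k} = \hat z_k$.

Now we use (43) and Assumption 4 to obtain almost surely
for any $k \ge 1$ and $x^* \in \mathcal{X}^*$,

$$\sum_{i\in\mathcal{V}}\mathsf{E}\big[[\pi_k]_i\|x_{i,k} - x^*\|^2 \mid \mathcal{F}_{k-1}\big] \le \sum_{i\in\mathcal{V}}[\pi_k]_i\|x_{i,k-1} - x^*\|^2 - 2\alpha_k c^{\top}(\hat z_k - x^*) - \frac{1}{2cC_g^2}\sum_{i\in\mathcal{V}}[\pi_k]_i\,\mathrm{dist}^2(p_{i,k},\mathcal{X}) + DN\alpha_k^2.$$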

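Since $\hat z_k \in \mathcal{X}$, we have $c^{\top}(\hat z_k - x^*) \ge 0$. Thus, under the
assumption $\sum_{k=0}^{\infty}\alpha_k^2 < \infty$ and Lemma 5(a), the above relation
satisfies all the conditions of the convergence theorem (Theorem 1).
Using this theorem, we have the following results.

Result 1: The sequence $\{\sum_{i\in\mathcal{V}}[\pi_k]_i\|x_{i,k} - x^*\|^2\}$ is convergent
a.s. for every $x^* \in \mathcal{X}^*$.

Result 2: For every $x^* \in \mathcal{X}^*$,

$$\sum_{k=1}^{\infty}\alpha_k c^{\top}(\hat z_k - x^*) < \infty \quad \text{a.s.}$$

From Result 1 and Lemma 5(b), we know that the sequence
$\{\sum_{i\in\mathcal{V}}[\pi_k]_i\|z_{i,k} - x^*\|^2\}$ is convergent a.s. for every $x^* \in \mathcal{X}^*$.
This and Lemma 6(b) imply that $\|\hat z_k - x^*\|$ is also convergent
a.s. for every $x^* \in \mathcal{X}^*$. From Result 2 and $\sum_{k=0}^{\infty}\alpha_k = \infty$, it
follows that the sequence $\{\hat z_k\}$ must have one accumulation
point in the set $\mathcal{X}^*$ a.s. This and the fact that $\{\|\hat z_k - x^*\|\}$ is
convergent a.s. for every $x^* \in \mathcal{X}^*$ imply that for a random
point $x^\star \in \mathcal{X}^*$,

$$\lim_{k\to\infty}\hat z_k = x^\star \quad \text{a.s.}$$

The remaining results follow immediately.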

V. Simulation Results 

In this section, we provide a numerical example showing 
the effectiveness of the proposed decentralized approximate 
projection algorithm. We consider optimal gossip averaging 
which is an example of decentralized optimization. 

In many decentralized algorithms, gossip based communication
protocols are often used. In these communication
protocols, only one agent randomly wakes up at a time (say
agent $i$) and selects one of its neighbors (say agent $j$) with
probability $p_{ij}$. Then, the two agents exchange their current
information through the link $(i,j)$ and take the average. Let
$A(i,j)$ denote the averaging matrix associated with the link
$(i,j)$. For example, the averaging matrix $A(1,2)$ of a 4-agent
network system looks like

$$A(1,2) = \begin{bmatrix} 1/2 & 1/2 & 0 & 0 \\ 1/2 & 1/2 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$

Note that the expectation of the averaging matrix $A$ can be
represented as $\mathsf{E}[A] = \frac{1}{N}\sum_{i,j\in\mathcal{V}} p_{ij}A(i,j)$.

Let $P$ denote the probability matrix whose $(i,j)$-th entry
is $p_{ij}$. Our goal here is to
find an optimal probability matrix $P^*$ associated with the
current communication graph, which is time-invariant and
connected, in a decentralized fashion. The convergence speed
of the gossip protocol is known to be inversely proportional
to $\lambda_2(\mathsf{E}[A])$, which is the second largest eigenvalue of the
expected averaging matrix $\mathsf{E}[A]$ (see [56]). Thus, the optimization
problem of finding the fastest averaging distribution $P^*$
can be formulated as the following SDP:

$$\min_{s,P}\ s \tag{46a}$$

$$\text{s.t.}\quad \frac{1}{N}\sum_{i,j\in\mathcal{V}} p_{ij}A(i,j) - \frac{1}{N}\mathbf{1}\mathbf{1}^{\top} \preceq sI \tag{46b}$$

$$p_{ij} \ge 0, \quad p_{ij} = 0 \ \text{if } (i,j) \notin \mathcal{E} \tag{46c}$$

$$\sum_{j\in\mathcal{V}} p_{ij} = 1, \quad \forall i \in \mathcal{V}. \tag{46d}$$

An optimal $P^*$ of the problem (46a)-(46d) computed in a
centralized fashion is not useful, as a gossip protocol is usually
required in a decentralized setting. A decentralized method
has been proposed in [56], but the method only finds a
suboptimal solution. Using our proposed algorithm, we can
find the optimal solution of (46a)-(46d) in a decentralized way.
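For intuition about the objective, the following Python sketch evaluates $\lambda_2(\mathsf{E}[A])$ for a given probability matrix. It assumes the waking agent is chosen uniformly at random, so that $\mathsf{E}[A] = \frac{1}{N}\sum_{i,j} p_{ij}A(i,j)$; the helper names, the 0-based indexing and the uniform 4-agent clique used as input are illustrative assumptions. Since every $A(i,j)$ is a symmetric projection matrix, $\mathsf{E}[A] - \frac{1}{N}\mathbf{1}\mathbf{1}^{\top}$ is positive semidefinite and plain power iteration recovers its largest eigenvalue, which equals $\lambda_2(\mathsf{E}[A])$.

```python
def pair_average(n, i, j):
    """Averaging matrix for the pair (i, j), using 0-based agent indices."""
    A = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    A[i][i] = A[j][j] = A[i][j] = A[j][i] = 0.5
    return A


def expected_averaging_matrix(P):
    """E[A] = (1/N) * sum_{i,j} P[i][j] * A(i, j), assuming uniform wake-ups."""
    n = len(P)
    EA = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and P[i][j] > 0.0:
                A = pair_average(n, i, j)
                for r in range(n):
                    for c in range(n):
                        EA[r][c] += P[i][j] * A[r][c] / n
    return EA


def second_largest_eigenvalue(EA, iterations=500):
    """Power iteration on E[A] - (1/N) 1 1^T, which is positive semidefinite."""
    n = len(EA)
    M = [[EA[r][c] - 1.0 / n for c in range(n)] for r in range(n)]
    v = [float(r + 1) for r in range(n)]   # start vector not parallel to 1
    for _ in range(iterations):
        w = [sum(M[r][c] * v[c] for c in range(n)) for r in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    Mv = [sum(M[r][c] * v[c] for c in range(n)) for r in range(n)]
    return sum(a * b for a, b in zip(v, Mv))   # Rayleigh quotient, |v| = 1


# Hypothetical input: 4-agent clique with uniform neighbor selection.
n = 4
P = [[0.0 if i == j else 1.0 / (n - 1) for j in range(n)] for i in range(n)]
lam2 = second_largest_eigenvalue(expected_averaging_matrix(P))
print(lam2)
```

A smaller value of `lam2` corresponds to faster averaging, so problem (46a)-(46d) can be read as searching over admissible `P` for the matrix that minimizes this quantity.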

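With a slight abuse of notation, let $x = [s \;\; P]$. In this
problem, all agents share the same local objective function,
i.e., $f_i(x) = s$ for all $i \in \mathcal{V}$, whereas each agent $i$ has a local
constraint set $\mathcal{X}_i = \mathcal{X}_i^1 \cap \mathcal{X}_i^2$ where

$$\mathcal{X}_i^1 = \Big\{x \ \Big|\ \frac{1}{N}\sum_{i,j\in\mathcal{V}} p_{ij}A(i,j) - \frac{1}{N}\mathbf{1}\mathbf{1}^{\top} \preceq sI\Big\},$$

$$\mathcal{X}_i^2 = \Big\{x \ \Big|\ \sum_{j\in\mathcal{V}} p_{ij} = 1,\ p_{ij} \ge 0,\ p_{ij} = 0 \ \text{if } j \notin \mathcal{N}_i\Big\}.$$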

At each iteration of our algorithm, we randomly select a
component constraint from $\mathcal{X}_i$ and make a projection. More
specifically, we approximate the projection onto the SDP
constraint $\mathcal{X}_i^1$ using the approximate projection update of our
algorithm. Note that the constraints
(46c)-(46d), which are distributed among agents, will guarantee
the structure of the underlying communication graph.
Therefore, agents do not require knowledge of the whole
graph structure.
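The random feasibility update described above can be sketched as follows. This is a minimal illustration, assuming the approximate projection is a Polyak-type step $x - \frac{g^+(x)}{\|d\|^2}d$ with $d$ a subgradient of the selected constraint function $g$ at $x$. The two toy constraints (a unit ball and a halfspace) stand in for the LMI and affine constraints of the gossip problem and are hypothetical, as are all function names.

```python
import random


def approx_projection(x, g, subgradient):
    """One subgradient step toward the set {x : g(x) <= 0}."""
    violation = max(g(x), 0.0)             # g^+(x), the constraint violation
    if violation == 0.0:
        return list(x)                     # already feasible for this constraint
    d = subgradient(x)                     # nonzero at infeasible points of a convex g
    norm_sq = sum(d_i * d_i for d_i in d)
    step = violation / norm_sq
    return [x_i - step * d_i for x_i, d_i in zip(x, d)]


def random_feasibility_step(x, constraints, rng):
    """Pick one component constraint at random and step toward it."""
    g, subgradient = rng.choice(constraints)
    return approx_projection(x, g, subgradient)


# Hypothetical constraints given as (g, subgradient) pairs.
unit_ball = (lambda x: sum(v * v for v in x) - 1.0,
             lambda x: [2.0 * v for v in x])
halfspace = (lambda x: x[0] + x[1] - 1.0,
             lambda x: [1.0, 1.0])

x_new = approx_projection([2.0, 0.0], *unit_ball)
x_rand = random_feasibility_step([2.0, 0.0], [unit_ball, halfspace],
                                 random.Random(0))
print(x_new, x_rand)
```

For the ball the step moves the point from distance 1 to distance 0.25 of the feasible set without reaching it, which illustrates why the update is only an approximate projection; for an affine constraint the same step coincides with the exact projection onto the halfspace.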

We note that due to the compactness of the set $\mathcal{X}_i^1$, the
problem (46a)-(46d) satisfies Assumption 2 and the optimal
solution set $\mathcal{X}^*$ is nonempty. Assumptions 1, 3 and 5 can be
satisfied by construction. Assumption 4 is also satisfied as all
inequalities are affine in this case.

TABLE I
Number of iterations for all network agents to converge
within 0.01% of the global average

            clique     cycle      star
N = 4       2,170      2,819      7,190
N = 15      2,179      8,280      18,541

We let all agents terminate if their solution is within 0.01%
of the global average and the total feasibility violation is less
than 0.001. We say the algorithm has converged only when all
network agents terminate. Note that this global average based
criterion is just used for the sake of simulations. In a real
setting, we can change the termination criterion in such a way
that each agent stops based on its local average. For example,
each agent can keep track of the iterates of its neighboring
agents and terminate if its own solution is within 0.01% of the
local average. Also, due to the randomness of our algorithm,
we repeat all the simulations 10 times and report their
averages.
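The stopping rule used in the simulations can be written down as a short check. The sketch below assumes that "within 0.01%" means a relative Euclidean distance of at most $10^{-4}$ to the global average; that reading, the function names and the sample estimates are assumptions made for illustration.

```python
from math import sqrt


def global_average(estimates):
    """Component-wise average of the agents' estimate vectors."""
    n = len(estimates)
    return [sum(column) / n for column in zip(*estimates)]


def has_converged(estimates, total_violation, rel_tol=1e-4, feas_tol=1e-3):
    """True when every agent is within rel_tol (0.01%) of the global average
    and the total feasibility violation is below feas_tol."""
    avg = global_average(estimates)
    scale = sqrt(sum(a * a for a in avg))
    for x in estimates:
        gap = sqrt(sum((x_i - a_i) ** 2 for x_i, a_i in zip(x, avg)))
        if gap > rel_tol * scale:
            return False
    return total_violation < feas_tol


# Hypothetical estimates of two agents and a hypothetical violation level.
close = [[1.0, 2.0], [1.00001, 2.0]]
far = [[1.0, 2.0], [1.1, 2.0]]
print(has_converged(close, 5e-4), has_converged(far, 5e-4))
```

A local variant of the rule would replace `global_average` by the average over an agent's own neighborhood, in line with the decentralized criterion described above.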

Table I summarizes the simulation results. It shows the
number of iterations until convergence for different numbers
of agents ($N$) and underlying communication topologies ($G_k$).
In the experiment, we use $N = 4, 15$ agents with three different
network topologies, namely clique, cycle and star. Note that for
this problem the underlying network must be time-invariant,
i.e., $G_k = G$ for $k \ge 1$, as the gossip algorithm in [56] is
built on a fixed undirected graph. As expected, the star graph
takes the most iterations for both $N = 4, 15$. Also, when
there are more agents in the network, the algorithm takes more
iterations.

VI. Conclusion 

We have studied a distributed optimization problem defined
on a multiagent network which involves nontrivial constraints
like LMIs. We have proposed a decentralized algorithm based
on random feasibility updates, where we approximate the
projection with an additional subgradient step. The proposed
algorithm applies to distributed optimization problems with
many constraints for which exact projections are computationally
prohibitive, for example, decentralized SDPs. We
have established the almost sure convergence of our method
under two different assumptions on the sequence of weight
matrices $\{W_k\}$, namely doubly stochastic $\{W_k\}$ over a
$Q$-strongly connected sequence of digraphs and row stochastic
$\{W_k\}$ over a strongly connected sequence of digraphs. We
have performed experiments on an optimal gossip averaging
problem to verify the performance and convergence of the
proposed algorithm.